Drug resistance profiling of asymptomatic and low-density Plasmodium falciparum malaria infections on Ngodhe island, Kenya, using custom dual-indexing next-generation sequencing

Malaria control initiatives require rapid and reliable methods for the detection and monitoring of molecular markers associated with antimalarial drug resistance in Plasmodium falciparum parasites. Ngodhe island, Kenya, presents a unique malaria profile, with lower P. falciparum incidence rates than the surrounding region, and a high proportion of sub-microscopic and low-density infections. Here, using custom dual-indexing and Illumina next generation sequencing, we generate resistance profiles on seventy asymptomatic and low-density P. falciparum infections from a mass drug administration program implemented on Ngodhe island between 2015 and 2016. Our assay encompasses established molecular markers on the Pfcrt, Pfmdr1, Pfdhps, Pfdhfr, and Pfk13 genes. Resistance markers for sulfadoxine-pyrimethamine were identified at high frequencies, including a quintuple mutant haplotype (Pfdhfr/Pfdhps: N51I, C59R, S108N/A437G, K540E) identified in 62.2% of isolates. The Pfdhps K540E biomarker, used to inform decision making for intermittent preventative treatment in pregnancy, was identified in 79.2% of isolates. Several variants on Pfmdr1, associated with reduced susceptibility to quinolones and lumefantrine, were also identified (Y184F 47.1%; D1246Y 16.0%; N86 98%). Overall, we have presented a low-cost and extendable approach that can provide timely genetic profiles to inform clinical and surveillance activities, especially in settings with abundant low-density infections, seeking malaria elimination.


Results
Illumina amplicon sequencing and coverage. In years 2015 and 2016, a high compliance (> 90% participation) MDA was carried out on Ngodhe island. Of the 201 PCR-positive dried blood spot (DBS) samples collected during the 2015/2016 MDA campaign on the island, 102 (53.7%) samples were deemed suitable for sequencing based on low DNA concentration measurements (Qubit HS DNA concentration > 0.4 ng/ul). From the 102 P. falciparum samples, 70 (68.7%) were successfully sequenced on an Illumina platform to cover a total of nine 500 base pair amplicons that encompassed five genes (Pfk13, Pfcrt, Pfmdr1, Pfdhps, and Pfdhfr) (Table S1; Fig. 1). The 70 sequenced samples were sourced from 49 (70.0%) sub-microscopic infections, 12 (17.1%) lowparasitaemia (< 1000 parasites/µl blood) and 9 (12.8%) moderate/high-parasitaemia infections (> 1000 parasites/ µl blood). Rapid diagnostic test (RDT) data was available for 24 of the sequenced samples, 5 with negative and 19 positive RDT results. Of the 5 samples that were negative by RDT but positive by PCR, 4 were also negative by microscopy. Individuals were classified as asymptomatic with an absence of fever or other acute symptoms 4 .
The median depth of read coverage achieved via amplicon sequencing overall was high (range of medians: 69-to 1760-fold; Table S1), with Pfk13 (1760-fold) having noticeably higher average coverage than Pfcrt, Pfmdr1, Pfdhps, and Pfdhfr (69-to 155-fold) (Table S1; Fig. 2). In general, regional coverage of 20-fold or higher is SNP frequencies on drug resistance-associated genes using Illumina sequencing. Across the 70 samples sequenced using the Illumina platform, only SNPs with a minimum read depth > tenfold on Pfcrt, Pfmdr1, Pfk13, Pfdhps, and Pfdhfr were included in this analysis to ensure accuracy in SNP reporting. For context and comparison, variant frequencies across these genes were compared to previously categorised frequencies from parasites on Mfangano island, Lake Victoria, as well as East and West African isolates available via the Pf3k database (Table 1) 10 . There were 3 nonsynonymous mutations observed on the Pfcrt gene, resulting in amino acid changes M74I, N75E, and K76T. These variants are well documented to be associated with resistance to chloroquine. The Pfcrt N74I polymorphism occurred in 3 out of 46 samples while the N75E and K76T polymorphisms were observed in 2 out of 45 samples. The Pfcrt K76T biomarker, the primary marker for chloroquine resistance, was observed in 4.4% of the Ngodhe samples, compared to 7.4% of isolates from Mfangano island in 2020 and 17.5% from wider East African populations 10 . The Pfcrt C72S and V73V variants were not observed. Analysis of haplotype frequencies on Pfcrt for samples with a read depth > tenfold at every position (codons 72, 73, 74, 75, and 76) identified the distributions of mutants and wild type parasites in the Ngodhe parasite population (n = 45) ( Table 2). The wild-type parasites, CVMNK, made up 95.5% of the population while the triple mutant, CVIET (mutations underlined), accounted for 4.5% of the population.
On the Pfmdr1 gene, 7 nonsynonymous SNPs were identified, 3 of which are reported to cause amino acid changes resulting in altered susceptibility to chloroquine and the ACT partner drug lumefantrine (N86Y, Y184F and D1246Y). The Pfmdr1 N86Y variant, where the wild-type (WT) allele is believed to be selected for by artemether-lumefantrine usage, was identified in 1 out of 50 samples (2.0%), appearing at similar frequencies to Mfangano island isolates (0%) 10 . The D1246Y mutation was observed in 5 out of 31 samples (16.0%), appearing at a higher frequency compared to Mfangano isolates (2.9%). Frequencies in East African isolates were higher for Pfmdr1 N86Y, 15.7%, while the N1246Y biomarker appeared at a similar frequency to Ngodhe isolates (12.9%). The Pfmdr1 Y184F biomarker was observed in 47.1% of samples (24 out of 51 isolates). This frequency was slightly higher than frequencies observed in Mfangano island (31.7%), and East African isolates (39.0%) 10 . The Pfmdr1 S1034C and N1042D variants were not observed in any of the 70 samples with read depth more than tenfold at this position.
Haplotype analysis of Pfmdr1 identified the frequencies of single and double mutants within the Ngodhe parasite populations based on codon positions 86, 184, 938, 1034,1042, and 1246 (n = 31) ( Table 2). Wild-type parasites, Pfmdr1 NYFSND, made up the largest fraction of the population, accounting for 45.2% of isolates, while the single mutant, NFFSND, made up 38.7%. The single mutant, Pfmdr1 NYFSNY, accounted for 9.7% of the population while the double mutant, NFFSNY, constituted the remaining isolates at 6.4%. No drug resistanceassociated polymorphisms were recorded on Pfk13. The Pfk13 variant A578S was recorded but is not thought to confer reduced susceptibility to artemisinin or any other antimalarial drugs.
There were 4 nonsynonymous SNPs identified on Pfdhfr, all of which resulting in amino acid changes documented to confer reduced susceptibility to pyrimethamine (N51I, C59R, S108N, and I164L). N51I, C59R, AND S108N changes were seen at similar frequencies amongst the Ngodhe isolates, with prevalence observed at   10 . The I164L variant was observed in 5.2% of Ngodhe isolates, similar to frequencies to Mfangano isolates (6.9%), and higher than isolates from East Africa (1.5%) and West Africa (0%). On Pfdhps, 3 nonsynonymous SNPs were observed, with 2 known to result in amino acid changes associated with resistance to sulphadoxine, A437G and K540E. The K540E mutant was observed in 79.2% of isolates and is generally used as a proxy measure for the presence of all 5 key mutations for resistance to SP (N51I, C59R, S108N, I164L, A437G, and K540E). The A437G mutation was observed to be fixed at 100% in samples from Ngodhe and Mfangano islands, compared to 84.4% in samples from wider East African populations.
Analysis of the Ngodhe parasite Pfdhfr/Pfdhps haplotype frequencies, for samples with a read depth more than tenfold at every nucleotide position, included Pfdhfr codons 51, 59, 108, and 164, as well as Pfdhps codons 437, 540, 581 (n = 37) ( Table 3). The quintuple mutant haplotype, IRNIGEA, was observed in a majority of screened Ngodhe isolates, accounting for 62.2% of the population, as well as a sextuple mutant haplotype, IRNLGEA, which was identified in 5.4% of isolates. The single mutant haplotype, NCSIGKA, was observed at the second highest frequency, 13.5%, while the wild-type haplotype, NCSIAKA, was not observed in any isolates. Double, triple, and quadruple mutants accounted for the remainder of the population, occurring at frequencies ranging from 2.7% to 5.4%. www.nature.com/scientificreports/  Table 3. Drug resistance haplotype frequencies on Pfdhfr/Pfdhps identified within the Ngodhe island P. falciparum parasite population (minimum read depth > tenfold). Amino acid changes in bold.

Discussion
Advancements in targeted low-cost sequencing approaches provide a viable means for monitoring the emergence and spread of characterised drug resistance-associated polymorphisms in both asymptomatic and low-density P. falciparum infections 28,31 . Asymptomatic infections tend to be under-represented in large-scale genetic analyses, despite accounting for most infections worldwide 32 . This is often due to the difficulties associated with extracting sufficient DNA needed to perform genomic characterisation, such as whole genome sequencing. Additionally, asymptomatic individuals do not tend to seek treatment and go undiagnosed. As countries make progress towards their malaria elimination goals, sub-microscopic and asymptomatic cases will remain the main reservoirs for disease 1 . To ensure accurate and informed disease control policies, molecular surveillance methods need to be sufficiently sensitive to detect regions of interest, such as drug resistance biomarkers, in these types of infections.
To assist in the generation of higher resolution drug resistance profiles of malaria parasites, we demonstrate the use of custom dual-indexing amplicon sequencing technology to identify molecular markers of resistance on Pfcrt, Pfmdr1, Pfdhps, Pfdhfr, and Pfk13 genes from asymptomatic and low-density P. falciparum infections on Ngodhe island in Lake Victoria, Kenya. Ngodhe island has a unique transmission profile compared to surrounding inhabited islands and mainland communities located within western Kenya and Lake Victoria. In a region of intense transmission, asymptomatic and sub-microscopic infections make up most cases detected on Ngodhe island, which also has an overall lower P. falciparum incidence 7 . Given these distinctive characteristics, assessing resistance biomarkers within Ngodhe island's P. falciparum population for the first time provided an invaluable baseline for drug resistance monitoring, as well as provided data that can be utilised to better inform policy making decisions concerning malaria control programmes within the region.
To preserve the efficacy of antimalarial drugs, monotherapy treatments (e.g. chloroquine) have been phased out for the treatment of uncomplicated malaria and replaced by ACTs in much of world, including Kenya 5,33 . Despite its discontinued use, chloroquine can still occasionally be found at local pharmacies in parts of Africa as a general treatment for fevers 1 . Monitoring chloroquine resistance mutations in parasite populations can provide insight into the presence of any ongoing drug selection pressure, which could be due to ongoing sales of chloroquine, as well as the return of chloroquine sensitivity, as seen in Malawi, following the complete removal of chloroquine from circulation 23 . The K76T mutation on Pfcrt, often used as the main molecular marker for chloroquine resistance, was observed to be at low frequencies in Ngodhe island parasites, with the wild-type K76 allele identified in most of the isolates screened through this study. The same was found to be true for other resistance conferring mutations on Pfcrt, with only wild-type alleles observed at codons 72 and 73, and high frequencies of wild-type alleles at codons 73 and 74. These results support the hypothesis that removal of chloroquine drug selection pressure may promote the return of chloroquine sensitive wild-type parasites [19][20][21][22] . Additionally, it has been documented that treatment with artemether-lumefantrine, extensively used in East Africa, selects for the chloroquine-susceptible Pfcrt K76 allele, suggesting that the K76T mutation may be a drug-specific contributor to enhanced P. falciparum susceptibility to lumefantrine 34,35 .
In accordance with WHO guidelines, SP continues to be used as IPTp in pregnant women but is not used as a first-line treatment for uncomplicated malaria 2 . Nonsynonymous polymorphisms on Pfdhfr and Pfdhps are associated with resistance to pyrimethamine and sulphadoxine, respectively. The degree of resistance to SP increases with each subsequent mutation on these genes, which has led to the emergence of quadruple, quintuple, and, more recently, sextuple mutations 25,36 . Resistance markers for SP were identified at high frequencies in Ngodhe island isolates, likely due to the continued drug selection pressure within the parasite population. The Pfdhfr/ Pfdhps quintuple mutant haplotype, encompassing the N51I, C59R, S108N, A437G and K540E mutations, was observed to be the most prominent haplotype, accounting for 62.2% of the screened population. Additionally, a sextuple mutant haplotype, including Pfdhfr/Pfdhps variants N51I, C59R, S108N, I164L, A437G and K540E, was identified in two isolates. Given the high frequencies of these SP resistance-associated biomarkers and continued circulation of SP in local parasite populations, there exists the possibility of relatively low fitness costs associated with maintaining these mutations.
The Pfdhps K540E variant is used as a proxy measure for the presence of all 5 key mutations associated with SP resistance on Pfdhfr and Pfdhps. However, following haplotype analysis, the K540E mutation was also observed in double and quadruple mutants at frequencies ranging from 2.7 to 5.4% 37,38 . In addition to being a proxy measure for SP resistance, the K540E mutation is also used to inform decision making by the WHO surrounding IPTp guidance, with IPTp implementation only recommended in regions where K540E prevalence is < 50% 39 . The K540E marker was identified in 79.2% of the Ngodhe island isolates, suggesting there may be reduced efficacy of IPTp-SP therapies within the region. The high frequency of this mutation highlights the need for continued monitoring of malaria infections in pregnant women to prevent adverse malaria-related outcomes, as well as the need for more safe and effective malaria drugs for use in pregnant women 40 .
There were no mutations observed on Pfk13 believed to confer a reduced susceptibility to artemisinin, however there were a handful of variants identified on Pfmdr1 associated with reduced susceptibility to the ACT partner drug lumefantrine. To prevent artemisinin resistance, ACTs combine artemisinin, a fast acting and highly effective antimalarial drug, with a partner drug that has a longer half-life to clear any remaining parasites 41,42 . Resistance to ACT partner drugs poses a threat to maintaining the efficacy of these combination therapies and the protection they provide against the emergence of artemisinin resistance. Artemether-lumefantrine is the ACT extensively used across much of Africa, including Kenya where it was introduced in 2006 27 . The Pfmdr1 Y184F and D1246Y variants, identified in 47.1% and 16% of isolates respectively, are associated with varying degrees of reduced susceptibility to lumefantrine and quinolines. Recent studies have found evidence that the Y184F mutant allele is not the primary determinant of resistance to lumefantrine but that there may be a genetic correlation between Y184F and the acquisition of a drug-resistance phenotype 26 . The N86Y variant, identified in 2% of isolates, is believed to be associated with changes to susceptibility of chloroquine and amodiaquine. www.nature.com/scientificreports/ The N86 wild-type codon, conversely, may confer a reduced susceptibility to lumefantrine and was identified in 98% of Ngodhe isolates 43 . There were several limitations with this study, especially related to the use of field isolates. The samples were collected as part of an MDA that occurred throughout 2015 and 2016. Due to the nature of malaria parasites and their ability to adapt quickly to selective pressures, drug resistance polymorphism prevalence may differ from this sample set and current day prevalence. However, there was previously no data available for this region and this study aimed to set a baseline for future studies, as well as demonstrate amplicon sequencing as a viable method for screening low-density malaria infections. Additionally, we would anticipate differences over time in the frequencies of molecular markers for drug resistance regardless of the 2015/2016 MDA as P. falciparum populations are always adapting to changes in drug pressures within the environment. Drug resistance linked to copy number variation, particularly relevant for Pfmdr1, was not addressed by this technique and leaves room for future work to address and supplement this methodology. However, amplification of Pfmdr1 is considered rare in East African parasite populations and is not believed to play a significant role in drug resistance in this region 44 . Finally, only 102 (of the 201) PCR-positive isolates were sequenced. The sequenced isolates had the highest DNA concentrations, and it is possible this could have led to some sampling bias and mutations could have been missed.
Progress towards global malaria eradication over the past few years has been hampered by the emergence and spread of resistance to antimalarial drugs and has highlighted the need for rapid and reliable methods to detect and monitor molecular markers of drug resistance. Here we applied amplicon-based approaches aimed at targeting known and putative drug resistance markers in established loci on Pfcrt, Pfmdr1, Pfdhps, Pfdhfr, and Pfk13 to demonstrate the effectiveness of sequencing-based high throughput surveillance in a low resource setting on asymptomatic, low-density, P. falciparum infections. The cost of sequencing has been prohibitive to the implementation of sequencing-based surveillance methods in low-resource settings. However, thanks to the development and application of dual-indexing technology, alongside targeted short-read sequencing, amplicon sequencing presents an affordable alternative to whole genome sequencing, as well as offers the potential for cross-platform capability, including both Illumina and Oxford Nanopore Technology sequencing platforms. In addition to more inclusive pricing (currently < USD 0.5 per amplicon), amplicon sequencing is easily expandable to include a wider range of targets across the Plasmodium genome, as well as targets within other species, suggesting the possibility of an integrated host-pathogen-vector surveillance system. Malaria species identification and parasite density assessment were carried out using microscopy, following WHO guidelines, at the Nagasaki University research station in Mbita, Kenya by trained microscopists and confirmed at Osaka Metropolitan University and LSHTM Malaria Reference Laboratory using established nested PCR assays 45,46 . Ngodhe island is occupied by the Luo ethnic group, migrant fishermen and families. The island is less than 1 km 2 in size and accessible only by small boat; located 3 km north of Rusinga island and the mainland town of Mbita. The use of long-lasting insecticide treated bed nets (LLINs) on the island increased between 2012 and 2015. This uptake was due to initiatives introduced by the Kenyan Ministry of Health to provide free LLINs for households and pregnant women. This increase in LLINs usage saw a decrease in malaria prevalence on the island during the 2012-2015 time-period. Indoor residual spraying (IRS) was not implemented on Ngodhe island until 2018.  (Table S1). Primers were designed to contain custom 6-nucleotide indices based on previously described and published methodologies by Nag et al. 27 . In this study, the Pfcrt primers were further optimised from those published by Nag et al. for use in Pf3D7, and field isolates, while the Pfmdr1 primer set was expanded to include the codon at position 184.

Methods
DNA extraction and sample preparation. DNA from isolates collected in years 2015 and 2016 was extracted in 2021 from dried-blood spots (DBS) preserved on filter papers. Filter papers were stored at 4 °C from 2015/2016 until blood extraction in 2021. A cold chain was not maintained during sample shipment from Kenya to the London School of Hygiene and Tropical Medicine. For DNA extraction, half of a DBS was used. DNA extraction was performed using the Qiagen QiaSymphony Automated Nucleic Acid Extraction Facility and the Qiagen QIAsymphony DSP DNA kit. Following extraction, DNA was amplified using an established selective whole genome amplification (SWGA) primer set and protocols 31 Amplicon purification and pooling. Indices for each amplicon target were conserved according to the sample identifier. Samples not containing shared index combinations were pooled (a maximum of 110 amplicons per pool was achieved) for purification (Fig. 1). Pooled samples were purified prior to sequencing using bead purification (KAPA), according to the manufacturer's instructions, using a ratio of 1:0.70 of product to bead volume to select for 400 to 500 bp segments of DNA. DNA concentration was measured using Qubit dsDNA HS (Invitrogen) and standardised to a final concentration of 20 ng per 25 µl.
Illumina sequencing and bioinformatics. P. falciparum isolates collected in 2015/2016 (n = 102) were sequenced using the Illumina MiSeq platform with 300 bp paired end kits at Genewiz (GENEWIZ Germany GmbH). The sequencing reads from pooled samples were demultiplexed, divided into separate files based on their unique indices, using an in-house python script to generate individual FASTQ files necessary for downstream analysis. Following demultiplexing, the raw sequencing data was then mapped to the Pf3D7 (P. falciparum) reference (v3) genome using bwa-mem software (default parameters for Illumina data) 48 . SNPs and short insertions and deletions (indels) were called using the samtools, freebayes, and GATK software suites [49][50][51] . The minimum base quality for gatk and freebayes was changed to 30 rather than their default parameters of 10 and 0, respectively. SNPs occurring in low quality or low coverage regions were discarded. Mixed call SNPs were assigned genotypes determined by a ratio of coverage in which nucleotide calls were 80% or higher. SNPs were annotated using bcftools consequence calling, which predicts functional variant consequences 49,52 .
Ethical approval and consent to participate. All experimental protocols and research were performed in accordance with relevant guidelines and regulations. Research was approved by the Mount Kenya University Independent Ethics Research Committee (MKU-IERC) (Approval reference: P609/10/2014) and the Ethics Committee at Osaka Metropolitan University (Approval number: 3206). Workshops and sensitisation meetings were carried out with communities to attain community consent to study participation. Written informed consent was obtained from all study participants whose parasite DNA was used in this study.

Data availability
All raw sequence data is available from the ENA (project accession number PRJEB58092).